Lysine formylation is a newly discovered post-translational
modification (PTM) in histones and other nuclear proteins; it
has a well-recognized but poorly defined role in chromatin
conformation modulation and gene expression. To date, there
is no general method to incorporate Nε-formyllysine at a
defined site of a protein. Here we report the highly efficient
genetic incorporation of the unnatural amino acid
Nε-formyllysine into proteins produced in Escherichia coli and
mammalian cells, by using an orthogonal Nε-formyllysine tRNA
synthetase/tRNA^Pyl_CUA pair. This technique can be applied to
study the role of lysine formylation in epigenetic regulation.


Post-translational modification (PTM) of histones is vital for
the regulation of diverse biological processes, including DNA
replication, DNA repair, and maintenance of genomic stability.
Aberrant histone modification leads to many diseases in humans.
To reveal the role of a specific PTM in histones, it is
essential to produce homogeneous recombinant protein that
contains the site-specific PTM. Semisynthetic and
enzyme-catalyzed methods have been developed to realize
site-specific histone PTM. Compared to these methods, the
genetic code expansion technique has the unique advantage that
unnatural amino acids (UAAs) can be directly incorporated at
specific sites of a protein of interest. For example,
three lysine PTMs (Nε-acetyllysine (AcK), Nε-crotonyllysine,
and Nε-methyllysine) have been successfully genetically
encoded. Lysine formylation is a newly discovered PTM in
histones and other nuclear proteins, and it is believed to be
associated with oxidative stress under pathological conditions.
The formyl moiety can come from 3′-formylphosphate residues
arising from 5′-oxidation of deoxyribose in DNA, caused by the
enediyne neocarzinostatin. Only one methyl group shorter than
AcK, Nε-formyllysine (ForK) is not only structurally similar,
but also appears at the same sites of histones. This raises
the question as to whether ForK and AcK have similar roles in
chromatin structure modulation and gene expression. However,
the lack of methods to site-specifically incorporate ForK into
defined sites of proteins limits our ability to probe the
function of ForK in vitro and in vivo.

Figure 1. Site-specific ForK incorporation. A) Chemical structure of ForK. B) SDS-PAGE
analysis of myoglobin-TAG4-ForK expression in the presence (+UAA) or absence (-UAA)
of 1 mM ForK. C) SDS-PAGE analysis of the expression of histone H3-TAG23-ForK in the
presence (+UAA) or absence (-UAA) of 1 mM ForK.

Here we report the highly efficient genetic incorporation of
the UAA ForK (Figure 1A) into proteins in Escherichia coli and
mammalian cells by using an orthogonal Nε-formyllysine tRNA
synthetase/tRNA^Pyl_CUA pair. We found that ForK-bearing
proteins are not recognized by an anti-AcK antibody, thus
indicating that proteins bearing ForK in vivo might interact
with other proteins differently from those bearing AcK.

To selectively incorporate ForK at defined sites in proteins,
we performed three rounds of positive and two rounds of
negative selection with a Methanosarcina barkeri
pyrrolysyl-tRNA synthetase (MbPylRS) library, in order to
evolve an orthogonal tRNA/aminoacyl-tRNA synthetase pair that
selectively charges ForK in response to the amber (TAG) codon,
as previously described. The selected ForK-specific PylRS
(“ForKRS”) had five mutations: L266M, L270I, Y271F, L274A, and
C313F. The ForKRS gene was digested, and two copies were
ligated sequentially into the SalI/BglII and PstI/NdeI
restriction sites of the pEVOL vector to generate pEVOL1.
Then, Methanosarcina barkeri pyrrolysyl-tRNA_CUA (pylT) was
ligated into the ApaLI/XhoI sites of pEVOL1 to generate
pEVOL-ForKRS, which was used to test the efficiency and
selectivity of ForKRS, as previously reported.

Figure 2. ESI-MS analysis of myoglobin. A) Wild-type (calcd: 18354 Da;
found: 18355.0 Da). B) Myoglobin-TAG4-ForK (calcd: 18423.0 Da; found:
18422.0 Da).

To determine whether ForK can be incorporated into a protein
with high efficiency and fidelity, an amber stop codon was
substituted for Ser4 in sperm whale myoglobin to construct
the myoglobin-TAG4 expression vector. Protein expression was
carried out in E. coli in the presence of the selected
synthetase (ForKRS), MbtRNA^Pyl_CUA (pylT), and 1 mM ForK (or
in the absence of ForK as a negative control). Analysis of the
purified protein by SDS-PAGE showed that full-length myoglobin
was expressed only in the presence of ForK (Figure 1B), thus
indicating that ForKRS recognizes ForK specifically. ESI-MS
analysis of myoglobin-TAG4-ForK gave an observed average mass
of 18422.0 Da (Figure 2), which was in agreement with the
calculated mass (18423 Da). This indicates that ForK had been
incorporated at the defined site of myoglobin. ForK was also
incorporated at position 23 of histone H3 (Figure 1C); this
position has been identified as a lysine formylation and
acetylation site. The yield of H3-TAG23-ForK was 2 mg/L, which
is comparable to yields reported for the incorporation of
other UAAs with the PylRS/MbtRNA^Pyl_CUA pair.
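
The agreement between the calculated and observed masses can be checked with simple arithmetic. The sketch below is illustrative only: it assumes average atomic masses and assumes that the sole difference from wild-type myoglobin is the replacement of one serine residue (Ser4) by one ForK residue. The helper names are hypothetical and are not part of the reported method.

```python
# Average atomic masses in Da (standard values, rounded to 3 decimals).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Residue compositions (amino acid minus one water, as found inside a chain).
SER_RESIDUE = {"C": 3, "H": 5, "N": 1, "O": 2}
# Lysine residue is C6H12N2O; formylation swaps one N-epsilon H for CHO (+CO).
FORK_RESIDUE = {"C": 7, "H": 12, "N": 2, "O": 2}


def residue_mass(composition):
    """Sum average atomic masses over an elemental composition."""
    return sum(ATOMIC_MASS[el] * n for el, n in composition.items())


def expected_mutant_mass(wild_type_mass):
    """Mass after replacing a single Ser residue with a ForK residue."""
    return wild_type_mass - residue_mass(SER_RESIDUE) + residue_mass(FORK_RESIDUE)


WILD_TYPE_CALCD = 18354.0   # Da, from the Figure 2 caption
MUTANT_FOUND = 18422.0      # Da, from the Figure 2 caption

shift = residue_mass(FORK_RESIDUE) - residue_mass(SER_RESIDUE)
expected = expected_mutant_mass(WILD_TYPE_CALCD)
print(f"Ser -> ForK mass shift: {shift:.1f} Da")
print(f"expected mutant mass:   {expected:.1f} Da")
print(f"found minus expected:   {MUTANT_FOUND - expected:+.1f} Da")
```

Under these assumptions the Ser-to-ForK substitution adds about 69 Da, which reproduces the calculated value of 18423 Da and leaves the observed mass within roughly 1 Da of it.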

To examine whether ForK is recognized by an anti-AcK anti- 
body, an amber stop codon was substituted for Lys3 and 
Tyr151 in superfolder green fluorescent protein (sfGFP) to con- 
struct the sfGFP-TAG3 and sfGFP-TAG151 expression vectors, 
respectively. AcK or ForK was incorporated into position 3 of
sfGFP, and the proteins were analyzed by western blotting. sfGFP-TAG3-AcK was
detected efficiently by an anti-AcK antibody; sfGFP-TAG3-ForK 
was not (Figure 3). This implies that the lysine formylation
PTM is significantly different from lysine acetylation.

Figure 3. Western blot analysis of wild-type and mutant sfGFP: lane 1) wild-
type sfGFP; lane 2) sfGFP-TAG3-AcK; lane 3) sfGFP-TAG3-ForK. The samples
were probed by using anti-His-tag and anti-Nε-acetyllysine antibodies.

As the pyrrolysyl-tRNA synthetase/tRNA pair is orthogonal in
both bacterial and mammalian cells, we tested whether the
ForKRS/MbtRNA^Pyl_CUA pair could perform site-specific
incorporation of ForK in mammalian cells. We cloned ForKRS
into the pCMV-NBK-1 vector to generate pCMV-ForKRS, in which
transcription of MbtRNA^Pyl_CUA is under the control of the
human U6 promoter and the expression of ForKRS is driven by
the CMV promoter. Plasmids pSwan-EGFP37TAG and pCMV-ForKRS
were co-transfected into 293T human embryonic kidney cells.
The cells were grown in the absence or presence of 1 mM ForK
for 36 h. EGFP fluorescence was observed only for cells grown
in the presence of ForK (Figure 4), thus indicating that, when
using the ForKRS/MbtRNA^Pyl_CUA pair, ForK was
site-specifically introduced at position 37 of EGFP through
amber codon suppression in mammalian cells.

Figure 4. Genetic incorporation of ForK at position 37 of EGFP in 293T cells
by using the ForKRS/MbtRNA^Pyl_CUA pair. EGFP fluorescence is observed
only in the presence of 1 mM ForK.
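
The reasoning behind the +UAA/-UAA comparisons can be illustrated with a toy model of amber codon suppression. This is a minimal sketch, not a model of the actual constructs: the DNA sequence is hypothetical, the symbol "X" stands in for ForK, and suppression is treated as all-or-nothing, whereas in cells it competes with release-factor-mediated termination.

```python
# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna, uaa_present, uaa_symbol="X"):
    """Translate DNA, reading TAG as the unnatural amino acid when it is supplied.

    Without the UAA, TAG acts as an ordinary stop codon and the product
    is truncated at that position.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == "TAG" and uaa_present:
            # Charged suppressor tRNA decodes the amber codon.
            protein.append(uaa_symbol)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # stop codon terminates translation
        protein.append(residue)
    return "".join(protein)


# Hypothetical reading frame with an amber codon at the fourth position.
TOY_GENE = "ATGGTTCTGTAGGAAGGTGAATAA"

print("without UAA:", translate(TOY_GENE, uaa_present=False))
print("with UAA:   ", translate(TOY_GENE, uaa_present=True))
```

In this simplified picture, the full-length product appears only when the UAA is present, which mirrors the pattern used as evidence in the SDS-PAGE and EGFP fluorescence experiments.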


Conclusion 


We have demonstrated the highly efficient genetic incorporation
of ForK into proteins in E. coli and human cells. The
ForK-incorporated protein was not recognized by an anti-AcK
antibody, thus indicating that proteins bearing the lysine
formylation PTM might interact with other proteins differently
from those bearing lysine acetylation. Lysine formylation has
also been found in histones at the sites of lysine acetylation
and methylation. This suggests that lysine formylation can
block lysine acetylation and methylation in vivo. Lysine
formylation can also negatively regulate nucleosome assembly.
In our work, ForK was site-specifically incorporated into
histones for the first time. This new technology could be a
powerful tool to elucidate the cellular functions of lysine
formylation caused by oxidative stress in cells. An anti-ForK
antibody is not commercially available. By expressing
homogeneous recombinant proteins bearing ForK at a specific
position, we are currently working towards producing
antibodies for the recognition of ForK.